Superscalar instruction issue

Author

  • Dezsö Sima
Abstract

Clearly, instruction issue and execution are closely related: the more parallel the instruction execution, the higher the requirements for the parallelism of instruction issue. Thus, we see a continuous and harmonized increase of parallelism in instruction issue and execution. This article focuses on superscalar instruction issue, tracing the way parallel instruction execution and issue have increased performance. It also spans the design space of instruction issue, identifying important design aspects and available design choices. The article also demonstrates a concise way to represent the design space using DS trees (see the related box), reviews the most frequently used issue schemes, and highlights trends for each design aspect of instruction issue.

Von Neumann processors evolved by and large in two respects. One reflects the technological improvements, which culminate in increasing clock rates. The second is the functional evolution of processors, which came about primarily by raising the degree of parallelism in internal operations, foremost in instruction issue and execution.

Processor function evolved in three consecutive phases. First were the traditional von Neumann processors, which are characterized by both sequential issue and sequential instruction execution, as depicted in Figure 1. The pursuit of higher performance then gave rise to parallel instruction execution. Designers introduced it using one of two orthogonal concepts: multiple (nonpipelined) execution units or pipelining. As a result, instruction-level parallel (ILP) processors emerged. Because early ILP processors used sequential instruction issue, processors arriving in the second phase of the evolution were scalar ILP processors. Subsequently, the degree of parallel execution rose even further through the use of multiple pipelined execution units.
While increasing the execution parallelism, designers soon reached the point where sequential instruction issue could no ...

Before choosing parallel instruction issue to increase performance, we must identify the important design aspects and choices. DS trees can help to concisely represent the design space of instruction issue.

Figure 1. Basic evolution phases of von Neumann processors. Thin portions of arrows indicate sequential operation; bold portions indicate parallel operation. The phases shown, in order of increasing processor performance, with their typical implementations:

  • Traditional von Neumann processors (sequential issue, sequential execution): nonpipelined processors
  • Scalar ILP processors (sequential issue, parallel execution), reached through parallelism of instruction execution: processors with multiple nonpipelined execution units, or pipelined processors
  • Superscalar ILP processors (parallel issue, parallel execution), reached through parallelism of instruction issue: VLIW and superscalar processors embodying multiple pipelined execution units
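
The three phases described in the abstract can be illustrated with a back-of-the-envelope cycle count. The sketch below is not taken from the article: it assumes fully independent instructions, a fixed execution latency, an idealized pipeline with no stalls, and hypothetical function names chosen only for this illustration.

```python
from math import ceil


def cycles_sequential(n: int, latency: int) -> int:
    """Phase 1: sequential issue, sequential execution.

    Each instruction finishes before the next one is issued.
    """
    return n * latency


def cycles_scalar_ilp(n: int, latency: int) -> int:
    """Phase 2: sequential issue (one per cycle), pipelined execution.

    Assumes an ideal pipeline: no dependencies, no stalls.
    """
    if n == 0:
        return 0
    return n + latency - 1


def cycles_superscalar(n: int, latency: int, issue_width: int) -> int:
    """Phase 3: parallel issue (issue_width per cycle), pipelined execution.

    Assumes enough pipelined execution units to accept every issued
    instruction, and again no dependencies or stalls.
    """
    if n == 0:
        return 0
    return ceil(n / issue_width) + latency - 1


if __name__ == "__main__":
    n, latency = 8, 4
    print(cycles_sequential(n, latency))      # 32
    print(cycles_scalar_ilp(n, latency))      # 11
    print(cycles_superscalar(n, latency, 2))  # 7
```

Under these simplifying assumptions, once execution is pipelined the issue rate becomes the limiting term, which is one way to read the abstract's point that more parallel execution raises the requirements on issue parallelism. Real processors fall short of these counts because of data and control dependencies.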

Similar resources

An Exploration Of Instruction Fetch Requirement In Out-of-order Superscalar Processors

Automated design of superscalar processors can provide future in terms a cycles-per-instruction (CPI) using the application program statistics and the 124, Optimization of Instruction Fetch Mechanisms for High Issue Rates 117, A first-order superscalar processor model Karkhanis, Smith 2004. Because superscalar architectures include complicated control logic for out-of-order execu...

Simultaneous Multithreading – Blending Thread-level and Instruction-level Parallelism in Advanced Microprocessors

The paper discusses the reasons and possibilities of exploiting thread-level parallelism in modern microprocessors. The performance of a superscalar processor suffers when instruction-level parallelism is low. The underutilization due to missing instruction-level parallelism can be overcome by simultaneous multithreading, where a processor can issue multiple instructions from multiple threads e...

Simplifying Hardware for Out Of Order Execution using the Decoupling Paradigm

Future hardware and software technology will try to provide improved performance by extracting higher levels of parallelism. However, the cost of a main memory access, in terms of missed instruction issue slots, increases with faster processors and greater issue widths. For this reason, latency-hiding technology remains one of the most important parts of high-performance processor designs. In this ...

Optimum Instruction-level Parallelism (ILP) for Superscalar and VLIW Processors

Modern superscalar and VLIW processors fetch, decode, issue, execute, and retire multiple instructions per cycle. By taking advantage of instruction-level parallelism (ILP), processor performance can be improved substantially. However, increasing the level of ILP may eventually result in diminishing and negative returns due to control and data dependencies among subsequent instructions as well ...

Design of Instruction Address Queue for High Degree X86 Superscalar Architecture

A major hurdle of recent x86 superscalar processor designs is limited instruction issue rate due to the overly complex x86 instruction formats. To alleviate this problem, the machine states must be preserved and the instruction address routing paths must be simplified. We propose an instruction address queue, whose queue size has been estimated to handle saving of instruction addresses with thr...

Journal title:
  • IEEE Micro

Volume 17

Publication date 1997